Conversational Telephone Speech Corpus Collection for the NIST Speaker Recognition Evaluation 2004
Abstract
This paper discusses some of the factors that should be considered when designing a speech corpus collection for text-independent speaker recognition evaluation. The factors include telephone handset type, telephone transmission type, language, and (non-telephone) microphone type. The paper describes the design of the new corpus collection being undertaken by the Linguistic Data Consortium (LDC) to support the 2004 and subsequent NIST speaker recognition evaluations. Some preliminary information on the resulting 2004 evaluation test set is offered.
Similar resources
Call My Net Corpus: A Multilingual Corpus for Evaluation of Speaker Recognition Technology
The Call My Net 2015 (CMN15) corpus presents a new resource for Speaker Recognition Evaluation and related technologies. The corpus includes conversational telephone speech recordings for a total of 220 speakers spanning 4 languages: Tagalog, Cantonese, Mandarin and Cebuano. The corpus includes 10 calls per speaker made under a variety of noise conditions. Calls were manually audited for langua...
HKUST/MTS: A Very Large Scale Mandarin Telephone Speech Corpus
The paper describes the design, collection, transcription and analysis of 200 hours of HKUST Mandarin Telephone Speech Corpus (HKUST/MTS) from over 2100 Mandarin speakers in mainland China under the DARPA EARS framework. The corpus includes speech data, transcriptions and speaker demographic information. The speech data include 1206 ten-minute natural Mandarin conversations between either stran...
NIST 2008 speaker recognition evaluation: performance across telephone and room microphone channels
We describe the 2008 NIST Speaker Recognition Evaluation, including the speech data used, the test conditions included, the participants, and some of the performance results obtained. This evaluation was distinguished by including as part of the required test condition interview type speech as well as conversational telephone speech, and speech recorded over microphone channels as well as speec...
The 1999 NIST speaker recognition evaluation, using summed two-channel telephone data for speaker detection and speaker tracking
The 1999 NIST Speaker Recognition Evaluation encompassed three tasks: one-speaker detection, two-speaker detection, and speaker tracking. All tasks were performed in the context of conversational telephone speech. The one-speaker task used single channel mu-law data; the other tasks used summed two-channel data. Twelve sites from the United States, Europe, and India participated in the evaluatio...
The NIST Meeting Room Pilot Corpus
One of the next big challenges in Automatic Speech Recognition (ASR) is the transcription of speech in meetings. This task is particularly problematic for current recognition technologies because, in most realistic meeting scenarios, the vocabularies are unconstrained, the speech is spontaneous and often overlapping, and the microphones are inconspicuously placed. To support the development of ...